    Statistical Estimation Under Distribution Shift: Wasserstein Perturbations and Minimax Theory

    Distribution shifts are a serious concern in modern statistical learning as they can systematically change the properties of the data away from the truth. We focus on Wasserstein distribution shifts, where every data point may undergo a slight perturbation, as opposed to the Huber contamination model where a fraction of observations are outliers. We consider perturbations that are either independent or coordinated joint shifts across data points. We analyze several important statistical problems, including location estimation, linear regression, and non-parametric density estimation. Under a squared loss for mean estimation and prediction error in linear regression, we find the exact minimax risk, a least favorable perturbation, and show that the sample mean and least squares estimators are respectively optimal. For other problems, we provide nearly optimal estimators and precise finite-sample bounds. We also introduce several tools for bounding the minimax risk under general distribution shifts, not just for Wasserstein perturbations, such as a smoothing technique for location families, and generalizations of classical tools including least favorable sequences of priors, the modulus of continuity, as well as Le Cam's, Fano's, and Assouad's methods. Comment: 60 pages, 7 figures
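The contrast the abstract draws between the two contamination models can be illustrated with a small simulation. This is only a sketch, not the paper's construction: the function names, the constant shift, the outlier value, and the sample size are all assumptions chosen for illustration. It relies on the fact that translating every point by `eps` moves the empirical distribution a Wasserstein distance of exactly `eps`.

```python
import random
import statistics


def wasserstein_shift(data, eps):
    """Move every point by eps (one possible coordinated shift, assumed here)."""
    return [x + eps for x in data]


def huber_contaminate(data, fraction, outlier):
    """Replace the first `fraction` of the points with a fixed outlier value."""
    k = int(round(fraction * len(data)))
    return [outlier] * k + list(data[k:])


# Hypothetical clean sample: 1000 standard normal draws with a fixed seed.
rng = random.Random(0)
clean = [rng.gauss(0.0, 1.0) for _ in range(1000)]

# Every point moves slightly (Wasserstein-style perturbation).
shifted = wasserstein_shift(clean, eps=0.1)
# A few points move arbitrarily far (Huber-style contamination).
contaminated = huber_contaminate(clean, fraction=0.05, outlier=50.0)

mean_clean = statistics.fmean(clean)
mean_shifted = statistics.fmean(shifted)
mean_contaminated = statistics.fmean(contaminated)
```

In this toy setup the sample mean moves by exactly the shift size under the small perturbation of every point, while replacing 5% of the points with a distant outlier moves it much further.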

    Multi-Span Acoustic Modelling using Raw Waveform Signals

    Traditional automatic speech recognition (ASR) systems often use an acoustic model (AM) built on handcrafted acoustic features, such as log Mel-filter bank (FBANK) values. Recent studies found that AMs with convolutional neural networks (CNNs) can directly use the raw waveform signal as input. Given sufficient training data, these AMs can yield a competitive word error rate (WER) to those built on FBANK features. This paper proposes a novel multi-span structure for acoustic modelling based on the raw waveform with multiple streams of CNN input layers, each processing a different span of the raw waveform signal. Evaluation on both the single channel CHiME4 and AMI data sets shows that multi-span AMs give a lower WER than FBANK AMs by an average of about 5% (relative). Analysis of the trained multi-span model reveals that the CNNs can learn filters that are rather different to the log Mel filters. Furthermore, the paper shows that a widely used single span raw waveform AM can achieve better WERs by using a smaller CNN kernel size and an increased stride. Comment: To appear in INTERSPEECH 201
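The data flow of a multi-span input layer, as described in the abstract, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's model: the span lengths, kernels, and strides are hypothetical, the kernels are fixed averaging filters rather than learned ones, and there is no nonlinearity or pooling. Each stream takes a window of a different length around the same centre sample, applies a strided 1-D convolution, and the stream outputs are concatenated.

```python
def conv1d(signal, kernel, stride):
    """Valid 1-D correlation of `signal` with `kernel` at the given stride."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(0, len(signal) - k + 1, stride)
    ]


def centered_span(waveform, center, length):
    """Return `length` samples centred on `center`, zero-padded at the edges."""
    start = center - length // 2
    return [
        waveform[i] if 0 <= i < len(waveform) else 0.0
        for i in range(start, start + length)
    ]


def multi_span_features(waveform, center, streams):
    """Run one convolution stream per span and concatenate the outputs."""
    features = []
    for span_len, kernel, stride in streams:
        span = centered_span(waveform, center, span_len)
        features.extend(conv1d(span, kernel, stride))
    return features


# Hypothetical stream settings: (span length in samples, kernel, stride).
streams = [
    (8, [0.5, 0.5], 2),                  # short span, small kernel
    (16, [0.25, 0.25, 0.25, 0.25], 4),   # longer span, wider kernel
]
waveform = [float(i) for i in range(32)]  # toy ramp standing in for raw audio
features = multi_span_features(waveform, center=16, streams=streams)
```

In a trained system each stream would have many learned filters followed by further layers; the sketch only shows how spans of different lengths around one frame feed separate streams.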

    In-Situ and Remote Sensing of the Environment Using KHawk UASs

    This presentation was given as part of the GIS Day@KU symposium on November 16, 2016. For more information about GIS Day@KU activities, please see http://gis.ku.edu/gisday/2016/. Platinum Sponsors: KU Department of Geography and Atmospheric Science. Gold Sponsors: Enertech, KU Environmental Studies Program, KU Libraries. Silver Sponsors: Douglas County, Kansas, KansasView, State of Kansas Data Access & Support Center (DASC) and the KU Center for Global and International Studies.

    Black Box Adversarial Prompting for Foundation Models

    Prompting interfaces allow users to quickly adjust the output of generative models in both vision and language. However, small changes and design choices in the prompt can lead to significant differences in the output. In this work, we develop a black-box framework for generating adversarial prompts for unstructured image and text generation. These prompts, which can be standalone or prepended to benign prompts, induce specific behaviors in the generative process, such as generating images of a particular object or generating high-perplexity text.